Reproducible and Accurate Matrix Multiplication

Authors

  • Roman Iakymchuk
  • David Defour
  • Sylvain Collange
  • Stef Graillat
Abstract

Because floating-point operations are non-associative and parallel architectures schedule work dynamically, obtaining a bit-wise reproducible floating-point result across multiple executions of the same code, on different or even similar parallel architectures, is challenging. In this paper, we address the problem of reproducibility in the context of matrix multiplication and propose an algorithm that yields both reproducible and accurate results. This algorithm is composed of two main stages: a filtering stage that uses fast vectorized floating-point expansions in conjunction with error-free transformations, and an accumulation stage based on Kulisch long accumulators in a high-radix carry-save representation. Finally, we provide implementations and performance results in parallel environments such as GPUs.
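The abstract does not include code, so the following is only a minimal Python sketch of the two ideas it names, not the authors' implementation. It shows an error-free transformation (Knuth's TwoSum) and an exact accumulation that is rounded once at the end. As an assumption made for brevity, Python's `Fraction` stands in for the Kulisch long accumulator; the names `two_sum`, `exact_dot`, and `matmul_reproducible` are hypothetical, and nothing here models the vectorized expansions, the carry-save representation, or GPU execution.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and e such that a + b == s + e exactly."""
    s = a + b
    bb = s - a                      # the part of b that made it into s
    e = (a - (s - bb)) + (b - bb)   # the rounding error that was lost
    return s, e

def exact_dot(xs, ys):
    """Accumulate sum(x * y) exactly, then round once to a double.

    Fraction is only a stand-in for a long accumulator (assumption).
    """
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)   # exact: every finite double is a rational
    return float(acc)                       # single, correctly rounded conversion

def matmul_reproducible(A, B):
    """C = A @ B for lists of lists, one exact dot product per entry."""
    cols = list(zip(*B))
    return [[exact_dot(row, col) for col in cols] for row in A]
```

In plain double arithmetic, `(1e16 + 1.0) - 1e16` evaluates to `0.0` while `(1e16 - 1e16) + 1.0` evaluates to `1.0`, which is the order dependence the abstract describes. Because `exact_dot` rounds only once, it returns `1.0` for either ordering of those three terms.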

Similar resources

A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...

Algebraic adjoint of the polynomials-polynomial matrix multiplication

This paper deals with a result concerning the algebraic dual of the linear mapping defined by the multiplication of polynomial vectors by a given polynomial matrix over a commutative field.

GEMMbench: a framework for reproducible and collaborative benchmarking of matrix multiplication

The generic matrix-matrix multiplication (GEMM) is arguably the most popular computational kernel of the 20th century. Yet, surprisingly, no common methodology for evaluating GEMM performance has been established over the many decades of using GEMM for comparing architectures, compilers and ninja-class programmers. We introduce GEMMbench, a framework and methodology for evaluating performance o...

SHORT-SS4: Error-Free Transformation of Matrix Multiplication by A Posteriori Verification

This paper is concerned with accurate computations for matrix multiplication. An error-free transformation of matrix multiplication is developed by the authors. It transforms a product of two floating-point matrices into a sum of several floating-point matrices using only floating-point arithmetic. This transformation is useful not only for accurate matrix multiplication but also for interval e...

Acceleration of a Preconditioning Method for Ill-Conditioned Dense Linear Systems by Use of a BLAS-based Method

We are interested in accurate numerical solutions of ill-conditioned linear systems using floating-point arithmetic. Recently, we proposed a preconditioning method to reduce the condition numbers of coefficient matrices. The method utilizes an LU factorization obtained in working precision arithmetic and requires matrix multiplication in quadruple precision arithmetic. In this note, we aim to a...

Publication date: 2014